ABSTRACT

We have used recent X-ray and optical data in order to impose some constraints on 
the cosmology and cluster scaling relations. 

Two kinds of hypotheses define our model. First, we consider that the cluster
population is well described by the standard Press-Schechter (PS) formalism.
Second, these clusters are assumed to follow scaling relations with mass:
Temperature-Mass ($T$-$M$) and X-ray Luminosity-Mass ($L_x$-$M$).

In contrast with many other authors, we do not assume specific scaling relations
to model cluster properties, such as the usual $T$-$M$ virial relation or an
observational determination of the $L_x$-$T$ relation. Instead we consider
general scaling relations with unconstrained parameters.

With this model (PS plus scalings) we fit our free parameters to several X-ray
and optical data sets, with the advantage over preceding works that we consider
all the data sets at the same time. This prevents us from being inconsistent
with some of the available observations. Among other conclusions, we find that
only low-density universes are compatible with all the data considered and that
the degeneracy between $\Omega_m$ and $\sigma_8$ is broken. We also obtain
limits on the parameters characterizing the scaling relations.

Key words: galaxies:clusters:general, cosmology:observations 



1 INTRODUCTION 

In recent years, the quality and quantity of new data sets 
coming from several X-ray missions allow a more precise 
study of the properties of galaxy clusters. These data, to- 
gether with optical data sets have allowed many authors to 
compare the predictions of different models with observa- 
tions. 

The standard approach is to simulate the data for a given parameter-dependent
model and then, by using an estimator (likelihood, $\chi^2$, etc.), to look for
the best-fitting model, that is, the parameter combination which best fits the
data.
In this process usually several assumptions are made. The 
most usual is that concerning the cluster population. Nor- 
mally it is assumed that the cluster population is well de- 
scribed by the Press-Schechter (PS) formalism (Press & 
Schechter 1974). This approach is supported by N-body nu- 
merical simulations which do show a good agreement with 
the PS parameterization (Efstathiou et al. 1988; White et 
al. 1993; Lacey & Cole 1994; Borgani et al. 1999). 



Another assumption usually made is the scaling of the temperature of the
cluster with its mass, the $T$-$M$ relation, which is taken as the virial
relation (Eke et al. 1996). A $T$-$M$ relation is necessary, for instance, to
build the temperature function of clusters (see section 3). However, it is not
clear to what extent the virial assumption is true for clusters, especially for
those at high redshift. Several works show that the relation between mass and
temperature has an exponent close or equal to the virial exponent,
$M \propto T^{3/2}$ (Evrard et al. 1996; Horner et al. 1999; Neumann & Arnaud
1999). However, the isothermal $\beta$-model and X-ray surface brightness
deprojection masses follow a steeper $M \propto T^{1.8-2.0}$ scaling
(Horner et al. 1999).

There are other scaling relations which are not well understood, in the sense
that they depend on the data used to build them and also on the method used to
fit the data. A good example is the Luminosity-Temperature relation
($L_x$-$T$). In the literature one can find scaling relations ranging from
$L_x \propto T^{2.6}$ (Markevitch 1998) to $L_x \propto T^{3.3}$ (David et al.
1993), while the most common one is $L_x \propto T^{2.9}$ (White et al. 1997;
Arnaud & Evrard 1999; Reichart et al. 1999). These results disagree on the
exponent of the relation, and more and better data will be needed to resolve
that discrepancy. Fabian et al. (1994) noted that this scatter is mostly due to
clusters with strong cooling flows. See also White et al. (1997) for a good
discussion of the effect of cooling flows. The method used to fit the
$L_x$-$T$ data can also explain part of this scatter. Conventional
least-squares regression analysis assumes that the abscissa data have zero
error. This problem is overcome, for instance, by the use of an algorithm that
takes into account errors in both dimensions of the data (White et al. 1997).

At present, X-ray observations are the best available data 
to study clusters. The amount of available X-ray data is in- 
creasing fast and in the near future larger data sets will be 
available. The strong X-ray emission from the hot gas in 
the intracluster medium makes the X-ray surveys an ideal 
way to detect clusters of galaxies. New catalogues of clusters have been
published in recent years, with the advantage that they are X-ray selected, and
new ones are in preparation.



Clusters have been used to impose constraints on cosmology in several papers
(Oukbir & Blanchard 1992; Lupino & Gioia 1995; Eke et al. 1996; Donahue 1996;
Kitayama & Suto 1997; Oukbir & Blanchard 1997; Mathiesen & Evrard 1998;
Donahue & Voit 1999; and many others). Clusters are the largest gravitationally
bound objects in the universe and represent the final stage of the peaks in the
primordial matter distribution. Their distribution in the mass-redshift
($M$, $z$) space is the fingerprint of those primordial fluctuations. The
cluster abundance and its evolution are an essential cosmological test. Their
modelling depends only on cosmological parameters and not on any cluster
scaling relation like $T$-$M$ or $L_x$-$T$, thereby allowing a more precise
determination of the cosmological parameters, independent of any assumption
about the cluster scaling relations. For this reason many authors have tried to
determine the cluster mass distribution as a function of redshift, the mass
function (Bahcall & Cen 1993; Biviano et al. 1993; Girardi et al. 1998). These
authors found many difficulties when they tried a direct determination of the
mass function. Basically, the problem is that the mass estimators are usually
based on different assumptions (spherical symmetry, virialization, hydrostatic
equilibrium). Lensing determinations work well, but the number of clusters with
mass determined by this technique is too small to build a mass function from
them.

An improvement could be to compare the models with the data using other X-ray
derived functions (luminosity, flux, temperature). The advantage of using X-ray
data is that the determination of the luminosity, flux or temperature of the
clusters is in general less affected by systematic errors than the usual mass
determination based on radial velocities of galaxies.

In this paper we want to extract some information about clusters and
cosmological parameters from cluster data. Our aim is to find a model (PS plus
scalings) which fits different observational data. This model will be realistic
in the sense that it describes present observations (mass, temperature, and
X-ray luminosity and flux functions).

This work follows many others but with two main differences. First, in our
model we will allow a large number of free parameters (9) instead of the one or
two free parameter models usually assumed. This will prevent us from making
wrong assumptions about the $T$-$M$ or $L_x$-$M$ scalings which could affect
the final conclusion. Our second difference is that we will consider different
data sets simultaneously. This is an important point, as we will show below,
where we demonstrate how some models with a good fit to some data sets are
nevertheless inconsistent with others.

The structure of the paper is the following. In section 2 we describe the
different data sets which will be used in the fits, and in section 3 we
describe the model used to fit these data. In section 4 we search for the best
model fitting the different data sets and discuss the best model estimator. In
section 5 we discuss the main results and compare them with previous works.
Finally, section 6 includes the main conclusions of this paper and some
implications for future X-ray and CMB experiments.

Throughout this paper we assume $H_0 = 100h$ km s$^{-1}$ Mpc$^{-1}$. Although
we work in $h$ units, this assumption should be taken into account when
comparing with other results.



2 THE DATA 

In this work we have compared our model (Press-Schechter plus the $T$-$M$ and
$L_x$-$M$ relations) with five different data sets: $dN(M)/dM$ (Bahcall & Cen
1993), $dN(M,z)/dM$ (Bahcall & Fan 1998), $dN(L_x)/dL_x$ (Ebeling et al. 1997),
$dN(S_x)/dS_x$ (Rosati et al. 1998; de Grandi et al. 1999), and $dN(T)/dT$
(Henry & Arnaud 1991).

The first one is the mass function given in Bahcall & Cen (1993), which is
built from a compilation of optical data of nearby clusters ($z < 0.1$). These
data have several uncertainties, mainly due to the poor precision in the
determination of cluster masses. The authors estimated the masses through the
richness and velocity dispersion of the clusters. More sophisticated methods,
such as lensing estimates, would be preferable in order to obtain a good mass
function, but unfortunately the number of clusters with masses estimated from
gravitational lensing is too small. It is important to bear in mind that the
masses in Bahcall & Cen (1993) were obtained from proportionality laws between
cluster richness and mass, or velocity dispersion and mass. Therefore, these
mass estimates should be considered as inferred masses and not as direct
measurements. There are other more recent determinations of the mass function
(Girardi et al. 1998) but they suffer from the same problems. From the
theoretical point of view, the mass function has the advantage of depending
only on the cosmological parameters and not on the parameters in the $T$-$M$ or
$L_x$-$M$ relations. Therefore the mass function is very useful to constrain
the cosmological parameters.

We would like to point out that the original mass function given in Bahcall &
Cen (1993) is a cumulative mass function. We have computed the differential
mass function from it by taking the difference between consecutive bins; the
corresponding error bars are built from the original ones by adding them
quadratically. It is also important to note that in Bahcall & Cen (1993) masses
are estimated within a radius of $1.5\,h^{-1}$ Mpc, whereas our masses are
estimated within the virial radius. As a first approximation we will consider
that the mass within a sphere of $1.5\,h^{-1}$ Mpc centered on the cluster and
the virial mass are equivalent. This is justified because the virial radius can
be well approximated by $r_v = 1.3\,M_{15}^{1/3}(1+z)^{-1}\,h^{-1}$ Mpc which,
for typical clusters, is of the order of $1\,h^{-1}$ Mpc. Clusters with masses
$M < 1.5 \times 10^{15}\,h^{-1}\,M_\odot$ will have a virial radius below
$1.5\,h^{-1}$ Mpc. In our model we have considered a cluster density profile
truncated beyond the virial radius. Therefore, these clusters will have the
same mass within larger radii ($1.5\,h^{-1}$ Mpc). Some problems could arise
with very massive clusters with $M > 1.5 \times 10^{15}\,h^{-1}\,M_\odot$, but
these are very rare and the correction factor will in any case be small.
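The conversion from a cumulative to a differential mass function described
above can be sketched as follows. This is a minimal illustration and not the
code used for the results in this paper: the function name is hypothetical and
the input numbers are made up purely to show the arithmetic.

```python
import numpy as np

def cumulative_to_differential(mass, n_cum, err_cum):
    """Difference consecutive bins of a cumulative function N(>M).

    Returns the bin-centre masses, dN/dM and its error. The error of each
    difference is the quadrature sum of the two original error bars.
    """
    mass = np.asarray(mass, dtype=float)
    n_cum = np.asarray(n_cum, dtype=float)
    err_cum = np.asarray(err_cum, dtype=float)

    d_mass = np.diff(mass)                       # bin widths
    d_n = n_cum[:-1] - n_cum[1:]                 # N(>M_i) - N(>M_{i+1})
    err = np.sqrt(err_cum[:-1] ** 2 + err_cum[1:] ** 2)
    centres = 0.5 * (mass[:-1] + mass[1:])
    return centres, d_n / d_mass, err / d_mass

# Made-up cumulative abundances, in arbitrary units
mass = [1.0, 2.0, 4.0]
n_cum = [10.0, 4.0, 1.0]
err_cum = [3.0, 4.0, 1.0]
centres, dn_dm, dn_dm_err = cumulative_to_differential(mass, n_cum, err_cum)
```

Note that adding the errors in quadrature treats neighbouring cumulative bins
as independent, which is the simplification stated in the text.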

In order to account for evolutionary effects in the mass function, we have also
considered another data set: the evolution of the mass function for massive
clusters (Bahcall & Fan 1998). In this data set the error bars are large, but
the data are good enough to constrain the cosmology further. Bahcall & Fan have
demonstrated that combining the two data sets can impose strong constraints on
$\Omega_m$ and $\sigma_8$. The best models found by Bahcall & Fan should
obviously be compatible with other data sets, but we have found that this is
not true in general. If we take models with a good fit to both the mass
function and the evolution of the mass function, only a few of those models
also give a good fit to other data sets (for example the luminosity function,
see fig. 1). This is the main reason why we decided to work with several
presently available cluster data sets at the same time. We looked for the model
that best fits the different data sets simultaneously. The additional data sets
came from X-ray observations.
Figure 1. An example of a bad model. The fit is good in the case of the mass
and temperature functions but this model does not reproduce the other two
curves. This model has the following typical values for the parameters, which
are commonly used in the literature (see subsection 3.2 and the later
discussion for an explanation of these parameters and of their values):
$\sigma_8 = 0.8$, $\Gamma = 0.2$, $\Omega_m = 0.3$ ($\Lambda = 0$),
$T_0 = 1.0 \times 10^{8}\,h^{\alpha}$ K, $\alpha = 2/3$, $\psi = 1.0$,
$L_0 = 1.0 \times 10^{45}\,h^{\beta}\,h^{-2}$ erg/s, $\gamma = 2.9$,
$\phi = 3.0$.



Several X-ray cluster catalogues have been published recently (Rosati et al.
1995; Burke et al. 1997; Collins et al. 1997; Scharf et al. 1997; Ebeling et
al. 1998; Vikhlinin et al. 1998; de Grandi et al. 1999; Voges et al. 1999;
Romer et al. 2000 and references therein). Some of these catalogues are deeper
in flux than others and they have different sky coverages. The techniques used
to detect the clusters are also different (wavelets, Vikhlinin et al. 1998;
Voronoi tessellation and percolation, Ebeling & Wiedenmann 1993), but they show
a remarkable agreement in the results. Particularly remarkable is the good
agreement in the luminosity function among all those works, showing that the
luminosity function is a robust indicator of the cluster population; this
function will be very useful in the process of fitting our model. For the
luminosity function we have used the estimate of Ebeling et al. (1997). This
luminosity function is built from a ROSAT 90% flux-complete sample of
$\sim 200$ bright clusters (Brightest Cluster Sample, BCS) in the northern
hemisphere at high galactic latitudes ($|b| > 20°$), with measured redshifts
$z < 0.3$ and fluxes higher than $4.4 \times 10^{-12}$ erg cm$^{-2}$ s$^{-1}$
in the 0.1-2.4 keV band. Different determinations of the luminosity function
have been given in the literature (Burke et al. 1997; Rosati et al. 1998;
Vikhlinin et al. 1998), all of them compatible with that in Ebeling et al.
(1997). We would like to point out that this curve is given for an
Einstein-de Sitter universe with $q_0 = 0.5$. In order to build the luminosity
function it is necessary to assume a cosmological model for the computation of
the luminosity distance and the comoving volume. We have checked the effect of
changing $\Omega_m$ in this function and found that it is negligible when
dealing with redshifts below 0.3, as in this case. For higher redshift data the
effect is still small, as can be seen from the figure in Bahcall & Fan (1998)
where the authors show the data for three different models.

Furthermore, there are other functions that can be used as a test of our model.
In particular, the flux function is relatively well established (there is only
a small scatter among the estimates of different authors). The main difference
between the flux and luminosity functions lies in the redshift and cosmological
model assumed. The flux function is a direct measure, in the sense that it does
not contain any information about the distance (redshift plus cosmological
model) from which the cluster is emitting. The luminosity function, on the
contrary, contains this additional information (redshift plus cosmological
model). Both functions are obviously connected by the assumed model.
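The connection between luminosity and flux through the assumed model can be
illustrated with a short sketch. It assumes an Einstein-de Sitter universe
($\Omega_m = 1$, $\Lambda = 0$, $q_0 = 0.5$), for which the luminosity distance
has the closed form $D_L = (2c/H_0)(1+z)[1 - (1+z)^{-1/2}]$, and it ignores
band and K corrections. The function names and constants are illustrative
choices, not taken from the paper.

```python
import math

C_KM_S = 299792.458      # speed of light in km/s
MPC_IN_CM = 3.0857e24    # one megaparsec in cm

def luminosity_distance_eds(z, h=1.0):
    """Luminosity distance in Mpc for Omega_m = 1, Lambda = 0 (q0 = 0.5)."""
    hubble_distance = C_KM_S / (100.0 * h)    # c / H0 in Mpc
    return 2.0 * hubble_distance * (1.0 + z) * (1.0 - 1.0 / math.sqrt(1.0 + z))

def flux_from_luminosity(lum_erg_s, z, h=1.0):
    """Flux in erg cm^-2 s^-1 from a luminosity in erg/s (no K correction)."""
    d_l_cm = luminosity_distance_eds(z, h) * MPC_IN_CM
    return lum_erg_s / (4.0 * math.pi * d_l_cm ** 2)
```

Changing the cosmological model only changes `luminosity_distance_eds`, which
is why the luminosity function depends on the assumed model while the flux
function does not.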

For the flux function we used the one given by Rosati et al. (1998) for
low-flux clusters, and for the bright part of the curve we used the function of
de Grandi et al. (1999). The sample of Rosati et al. (1998) (ROSAT Deep Cluster
Survey, RDCS) covers the redshift range 0.05-0.8 and is a complete flux-limited
subsample of 70 galaxy clusters, representing the brightest half of the full
sample, which have been spectroscopically identified down to the flux limit
$4 \times 10^{-14}$ erg cm$^{-2}$ s$^{-1}$ in the 0.5-2.0 keV band. In the RDCS
sample the sky coverage is small (48 deg$^2$), whereas the sample of de Grandi
et al. has a larger sky coverage (8235 deg$^2$) but a higher limiting flux
($\sim 3.5 \times 10^{-12}$ erg cm$^{-2}$ s$^{-1}$ in the 0.5-2.0 keV band),
and is therefore shallower ($z < 0.3$) than the RDCS sample.

The final curve we have used to constrain our model is the temperature
function. Henry & Arnaud (1991) compiled a temperature function from a sample
of 25 nearby clusters. Their sample is X-ray selected and comes from Lahav et
al. (1989), subject to the additional restrictions that the flux in the 2-10
keV band must be $> 3 \times 10^{-11}$ erg cm$^{-2}$ s$^{-1}$ and the galactic
latitude $|b^{\rm II}| > 20°$ (see Piccinotti et al. 1982). The sample is more
than 90% complete and redshifts range between $z = 0.0036$ and $z = 0.09$.

The temperature function of Henry & Arnaud (1991) is known to suffer from some
errors (Eke et al. 1996, Markevitch 1998, Henry 2000) but, as mentioned in Eke
et al. (1998) and Henry (2000), these errors are largely compensated. The
temperature function is usually presented in integral form. A determination of
the differential temperature function requires binning the data and performing
an average over the objects in the bin. This procedure introduces some
arbitrariness that the integral form avoids. However, because our method is
based on $\chi^2$ quantities, we need the temperature function in differential
form. The arbitrariness of this binned function could be reduced significantly
by increasing the number of clusters with measured temperature. However, there
are few clusters for which the temperature is known precisely, and consequently
the differential temperature function is poorly determined. In order to check
the validity of the Henry & Arnaud (1991) temperature function against more
recent data, we computed a binned version of the temperature function using the
Henry (2000) data. Our estimate of the differential temperature function proved
to be in good agreement, within the error bars, with the previous estimate of
Henry & Arnaud (1991). Because of this agreement and the large error bars of
this function, our results will not depend significantly on the choice of one
or the other temperature function.

Although the temperature function is affected by large error bars, its use is
justified because, unlike the luminosity or flux functions, only the $T$-$M$
relation is needed to build it. To compute the theoretical luminosity and flux
functions from the PS formalism, both the $L_x$-$M$ and $T$-$M$ relations are
needed. The first is used to obtain the bolometric luminosities from the mass
and the second is required to obtain the luminosities in the observed band.
Hence, the temperature function is less affected by the uncertainties in the
cluster scaling relations than the luminosity and flux functions. A recent
determination of the temperature function can be found in Blanchard et al.
(2000) and Henry (2000). Their determination of this function is compatible
with the one in Henry & Arnaud (1991) for temperatures $> 3$ keV.
Information about the redshift and sky coverages, the limiting flux, and the
energy band in which luminosities and fluxes are given is needed in order to
simulate the data correctly, following the characteristics of the observations.
The total number of clusters, and thereby the error bars, will depend on the
redshift and sky coverages and also on the limiting fluxes. The shape of the
functions will depend on the limiting flux because, as the limiting flux is
lowered, less massive and more distant clusters are selected. Energy band and K
corrections must also be included in order to correct the bolometric
luminosity. Finally, the cluster number densities are based on the computation
of $V_{\rm max}$, the maximum volume in which the cluster could have been and
still remained in the sample. Therefore these volumes will depend on the
limiting flux (see Page & Carrera 2000 for a good discussion of the 1/V
method).

All those observational features will be considered when performing a bias test
using Monte Carlo simulations of the models in a later section.
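As an illustration of how number densities follow from $V_{\rm max}$, the
sketch below bins a set of objects and sums $1/V_{\rm max}$ in each bin. It is
only a schematic version of the estimator: the function name and the input
numbers are hypothetical, and the error bar, $\sqrt{\sum 1/V_{\rm max}^2}$, is
a common Poisson-like choice assumed here rather than one stated in the text.

```python
import numpy as np

def density_from_vmax(values, v_max, bin_edges):
    """Binned differential number density from the 1/V_max estimator.

    Each object contributes 1/V_max to its bin. The error adopted here is
    sqrt(sum 1/V_max^2), which is an assumption of this sketch.
    """
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(v_max, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    density, _ = np.histogram(values, bins=bin_edges, weights=weights)
    variance, _ = np.histogram(values, bins=bin_edges, weights=weights ** 2)
    widths = np.diff(bin_edges)
    return density / widths, np.sqrt(variance) / widths

# Made-up luminosities and maximum volumes, in arbitrary units
lum = [1.0, 1.5, 3.0]
v_max = [2.0, 4.0, 10.0]
phi, phi_err = density_from_vmax(lum, v_max, bin_edges=[0.0, 2.0, 4.0])
```

In a real survey each `v_max` would itself be computed from the limiting flux,
the sky coverage and the assumed cosmology.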

These data sets are not completely independent. Some clusters are common to the
different catalogues and one should consider the dependence between the data,
but it can be shown that the dependence is not very significant. The
luminosities and fluxes are independent because the redshift is needed to
compute the luminosity from the flux. Because the redshift is an independent
variable with respect to the flux, the luminosity should also be considered
independent of the flux. The temperature is another independent quantity, so we
do not expect correlations between this data set and the others. However, there
is a clear correlation between the first data point in the evolution of the
mass function and one point of the local mass function. Indeed, the information
given by the comoving number density of clusters
$N(M > 8.0 \times 10^{14}\,h^{-1}\,M_\odot)$ at $z = 0$ is contained in both
data sets. Apart from this, we consider that the rest of our data points are in
fact independent. The situation is different for the theoretical curves. The
model will introduce some correlations among the curves, as we will see in the
next section.



3 THE MODEL 

3.1 The Press-Schechter formalism 

As in previous works, the starting point of our model is the mass function,
which contains the information about how many clusters there are at a given
redshift and how massive they are. We adopt the standard Press-Schechter
formalism (Press & Schechter 1974), which has been shown to be consistent with
N-body simulations (Lacey & Cole 1994; Borgani et al. 1999).

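In this formalism the cluster number density per unit mass as a function of
mass and redshift is given by:

$$\frac{dN(M,z)}{dV(z)\,dM} = \sqrt{\frac{2}{\pi}}\,\frac{\bar\rho}{M^2}\,\frac{\delta_{c0}(z)}{\sigma_M}\left|\frac{d\log\sigma_M}{d\log M}\right|\exp\left[-\frac{\delta_{c0}^2(z)}{2\sigma_M^2}\right], \qquad (1)$$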

where $\bar\rho$ is the present-day average matter density,
$\bar\rho = \Omega_m \times 2.7755 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$,
and $\delta_{c0}(z)$ is the linear theory overdensity extrapolated to the
present time for a uniform spherical fluctuation collapsed at redshift $z$.
For an $\Omega_m = 1$ model we have used $\delta_{c0}(z) = 1.6865(1+z)$ and for
$\Omega_m < 1$ we take $\delta_{c0}(z) = D^{-1}(z)\,\delta_c(z)$, where $D(z)$
is the linear growth factor (Peebles 1980) and:

$$\delta_c(z) = \frac{3}{2}D(z)\left[1 + \left(\frac{2\pi}{\sinh(\eta) - \eta}\right)^{2/3}\right], \qquad (2)$$

for an open $\Lambda = 0$ model and:

$$\delta_c(z) = 1.6866\,[1 + 0.01256\log_{10}\Omega_m(z)], \qquad (3)$$

for a flat $\Lambda$CDM model (see Kitayama & Suto 1996, Mathiesen & Evrard
1998 for details).
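A small numerical sketch of equation (3) is given below. It assumes that the
logarithm is base 10 and that $\Omega_m(z)$ follows the usual expression for a
flat universe with a cosmological constant,
$\Omega_m(z) = \Omega_m(1+z)^3/[\Omega_m(1+z)^3 + 1 - \Omega_m]$; both the
function names and this explicit form of $\Omega_m(z)$ are assumptions of the
example rather than statements from the text.

```python
import math

def omega_m_of_z(omega_m0, z):
    """Matter density parameter at redshift z for a flat Lambda model."""
    a3 = (1.0 + z) ** 3
    return omega_m0 * a3 / (omega_m0 * a3 + 1.0 - omega_m0)

def delta_c_flat(omega_m0, z):
    """Equation (3): critical overdensity at collapse for flat LambdaCDM."""
    return 1.6866 * (1.0 + 0.01256 * math.log10(omega_m_of_z(omega_m0, z)))
```

The dependence on $\Omega_m$ is weak: the threshold changes by less than one
per cent between $\Omega_m = 0.3$ and $\Omega_m = 1$ at $z = 0$, and it tends
to 1.6866 at high redshift where $\Omega_m(z) \to 1$.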

$\sigma_M$ is the rms of the density fluctuation at the mass scale $M$, which
is related to the power spectrum of density fluctuations $P(k)$ through:

$$\sigma_M^2 = \frac{1}{2\pi^2}\int_0^\infty dk\,k^2\,P(k)\,W^2(kR), \qquad (4)$$

where the window function $W(kR)$ is introduced in order to select the volume
from which the object with mass $M$ will be formed. We have used the standard
top-hat window function, whose Fourier transform is
$W(kR) = 3[\sin(kR) - kR\cos(kR)]/(kR)^3$. $R$ is the comoving scale
corresponding to the mass $M$, and the relation between both quantities is
$M = \frac{4}{3}\pi\bar\rho R^3$.

For the power spectrum we have used the following parameterization:

$$P(k) = A\,\sigma_8^2\,k^n\,T^2(k). \qquad (5)$$



The amplitude $A$ is computed from equation (4) by taking in that equation the
mass corresponding to $R = 8\,h^{-1}$ Mpc and eliminating the parameter
$\sigma_8$ from both sides of the equality. $n$ is the index of the primordial
power spectrum. We fixed this parameter to the Harrison-Zeldovich case $n = 1$,
in accordance with determinations from CMB data (COBE-DMR, Bennet et al. 1996;
MAXIMA, Balbi et al. 2000). Finally, $T(k)$ is the transfer function, for which
we used the fit given in Bardeen et al. (1986) for an adiabatic CDM model:

$$T(k) = \frac{\ln(1 + 2.34q)}{2.34q}\left[1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4\right]^{-1/4}, \qquad (6)$$

where $q = k/\Gamma$, with $k$ in units of $h\,{\rm Mpc}^{-1}$ and $\Gamma$ the
shape parameter of the power spectrum. For a CDM model with negligible
$\Omega_b$, $\Gamma \simeq \Omega_m h$. We have imposed one additional
constraint in our calculations. Although all our data sets and quantities are
$h$ independent (everything is in $h$ units), we have only considered those
models for which the ratio $\Gamma/\Omega_m$ lies between the conservative
limits $0.5 < h < 0.75$, thus avoiding CDM models which could be inconsistent
with recent determinations of $h$.
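Equations (4) to (6) can be combined into a short numerical sketch of
$\sigma_M$. This is an illustration under stated assumptions, not the code used
in this work: the integration grid, its limits and all function names are
choices made for the example. The amplitude $A$ never appears explicitly
because dividing by the variance at $R = 8\,h^{-1}$ Mpc is equivalent to
solving equation (4) for $A$.

```python
import numpy as np

RHO_CRIT = 2.7755e11   # critical density in h^2 Msun / Mpc^3

def bbks_transfer(k, gamma):
    """Equation (6): transfer function; k in h/Mpc, gamma = shape parameter."""
    q = k / gamma
    poly = 1.0 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return np.log(1.0 + 2.34 * q) / (2.34 * q) * poly ** -0.25

def tophat_window(x):
    """Fourier transform of a spherical top hat, W(kR) with x = kR."""
    return 3.0 * (np.sin(x) - x * np.cos(x)) / x ** 3

def radius_of_mass(mass, omega_m):
    """Comoving radius (Mpc/h) enclosing `mass` (Msun/h): M = 4/3 pi rho R^3."""
    return (3.0 * mass / (4.0 * np.pi * omega_m * RHO_CRIT)) ** (1.0 / 3.0)

def sigma_of_radius(radius, sigma8, gamma, n=1.0):
    """rms fluctuation in spheres of comoving radius `radius` (Mpc/h)."""
    k = np.logspace(-4, 2, 4000)    # integration grid in h/Mpc (a choice)

    def variance(r):
        integrand = (k ** (2.0 + n) * bbks_transfer(k, gamma) ** 2
                     * tophat_window(k * r) ** 2)
        # Trapezoidal rule; the 1/(2 pi^2) prefactor cancels in the ratio
        return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(k))

    # Normalising at R = 8 Mpc/h fixes the amplitude A of equation (5)
    return sigma8 * np.sqrt(variance(radius) / variance(8.0))
```

For a given cluster mass one would call
`sigma_of_radius(radius_of_mass(mass, omega_m), sigma8, gamma)`.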

In this formalism there are two main variables: the mass and the redshift of
the cluster. Therefore, the Press-Schechter mass function, which predicts the
density of clusters expected at a given redshift and mass, can be considered as
the probability distribution of clusters in the mass-redshift space ($M$-$z$)
once it is normalized by the total number. There are basically three
cosmological parameters in this formalism: the density of the universe,
$\Omega_m$; the amplitude of the power spectrum, which we parameterize in units
of $\sigma_8$; and the shape parameter of the power spectrum, $\Gamma$.
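The mass function of equation (1) is simple to evaluate once $\sigma_M$ and its
logarithmic slope are known. The sketch below takes them as inputs so that it
stays independent of how $\sigma_M$ is computed; the function name and argument
names are illustrative.

```python
import numpy as np

RHO_CRIT = 2.7755e11   # critical density in h^2 Msun / Mpc^3

def press_schechter(mass, delta_c0, sigma, dlnsigma_dlnm, omega_m):
    """Equation (1): comoving number density per unit mass, dN/dVdM.

    mass is in Msun/h, sigma is sigma_M at that mass and dlnsigma_dlnm its
    logarithmic derivative; delta_c0 is the threshold at the chosen redshift.
    """
    rho_mean = omega_m * RHO_CRIT
    nu = delta_c0 / sigma
    return (np.sqrt(2.0 / np.pi) * rho_mean / mass ** 2 * nu
            * np.abs(dlnsigma_dlnm) * np.exp(-0.5 * nu ** 2))
```

A useful sanity check is that the mass fraction in collapsed objects,
$\int M\,(dN/dV\,dM)\,dM/\bar\rho$, integrates to one over all masses; this
holds for any monotonic $\sigma_M$ and can be verified numerically with a toy
power law.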
We can compare this model with real observations of the mass function and, by
doing so, obtain some information about these three parameters. This has been
done in several works (Bahcall & Cen 1993; Girardi et al. 1998), which have
shown, for instance, that low-density universes are more compatible with the
observed mass function.

However, there are some problems with these works. First, the quality of the
data is not very good, mainly because most of the masses have been estimated
using radial velocities of cluster galaxies. Second, the mass functions are
built only for nearby clusters and do not contain any information about the
cluster abundance at high redshift. There have been some attempts to estimate
the evolution of the mass function with redshift and, although the error bars
are very large, one can obtain interesting constraints on the cosmological
parameters using this evolution (Bahcall & Fan 1998). This indicates that
accurate information on the cluster abundance at high redshift would be a very
powerful tool to constrain the cosmology. Unfortunately the mass function of
clusters at high redshift is not yet well determined, but there are other
functions which can be used in addition to the mass function. Recent X-ray
experiments (Einstein, ASCA, ROSAT) have determined the temperature, luminosity
and flux for several hundred clusters, some of them at medium and high redshift
(up to $z = 0.8$ in the RDCS). This information can be used to build new
functions similar to the mass function, based on the temperature, luminosity
and flux of the clusters. For instance, the expected temperature function up to
a given redshift, $dN/dT$ (which can be compared with the corresponding
observational temperature function), will be given by the integral over the
redshift interval of:

$$\frac{dN(T,z)}{dV(z)\,dT} = \frac{dN(M,z)}{dV(z)\,dM}\,\frac{dM}{dT}, \qquad (7)$$

where $dN(M,z)/dV(z)dM$ is the Press-Schechter mass function. In order to build
that function we need to calculate the derivative $dM/dT$, and hence a $T$-$M$
relation is required. Usually the virial relation is assumed,
$T \propto M^{2/3}(1+z)$, though as discussed below we will introduce free
parameters to describe this relation. To build the X-ray luminosity and flux
functions we operate in the same way, but in this case we need the relation
between the mass and the X-ray luminosity of the cluster, the $L_x$-$M$
relation. There are few attempts to determine the $L_x$-$M$ relation
observationally, but the situation is different for the $L_x$-$T$ relation
(David et al. 1993; Markevitch 1999; Reichart et al. 1999). These works show
that this relation follows a scaling $L_x \propto T^{2.6-3.3}$. The exponent of
the scaling depends on whether or not clusters with cooling flows are
considered, being higher when clusters with cooling flows enter the analysis.
Another contribution to that scatter is that different statistical methods have
been used to analyze the data (White et al. 1997).

Using the $T$-$M$ relation and the $L_x$-$T$ scaling it is possible to build an
$L_x$-$M$ relation, which can be used to construct the luminosity and flux
functions.
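The change of variables in equation (7) can be sketched for a power-law $T$-$M$
relation of the form $T = T_0 M_{15}^{\alpha}(1+z)^{\psi}$, which contains the
virial case for $\alpha = 2/3$ and $\psi = 1$. The mass function is passed in
as a callable so that the example does not depend on a particular
implementation; the function and argument names are hypothetical.

```python
import numpy as np

def temperature_function(temp, z, dn_dm, t0, alpha, psi):
    """dN/dVdT from dN/dVdM via equation (7) for a power-law T-M relation.

    `dn_dm(m15, z)` is any callable returning the mass function per unit
    M15 (mass in units of 1e15 Msun/h).
    """
    temp = np.asarray(temp, dtype=float)
    # Invert T = t0 * M15**alpha * (1+z)**psi to get the mass at each T
    m15 = (temp / (t0 * (1.0 + z) ** psi)) ** (1.0 / alpha)
    # For a power law, dT/dM = alpha * T / M, hence dM/dT = M / (alpha * T)
    dm_dt = m15 / (alpha * temp)
    return dn_dm(m15, z) * dm_dt
```

Because this is a pure change of variables, the number of clusters in a
temperature interval equals the number in the corresponding mass interval,
which provides a simple numerical check of any implementation.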



3.2 Cluster scaling relations 

Starting from the Press-Schechter mass function plus the $T$-$M$ and $L_x$-$M$
relations, the idea of this work is, therefore, to build the mass function
itself and the remaining curves: the temperature, X-ray luminosity and flux
functions. We will compare these curves with the corresponding observational
data sets and, by changing our model parameters, we will look for the best
model simultaneously compatible with all the different data sets. All we need
to know, therefore, are the $T$-$M$ and $L_x$-$M$ relations.

For the $T$-$M$ relation, the most common model comes from the virial theorem
plus the spherical collapse model and the assumption of an isothermal gas
distribution (Eke et al. 1996):

$$T_{gas} \propto M^{2/3}(1+z). \qquad (8)$$



The shortcomings of this relation are well known (Eke et al. 1996; Kitayama &
Suto 1996; Viana & Liddle 1996; Voit & Donahue 1998). Basically the problem is
that this assumption only holds for virialized objects. In the case of clusters
this is more or less true at low redshift, where the equilibrium conditions
required by the virial theorem are achieved, but we do not know what happens at
high redshift. Similar problems affect the redshift evolution of this relation.
As discussed in Voit & Donahue (1998), the consequences of using an inaccurate
$T$-$M$ relation can be quite significant. For these reasons we will treat this
relation as unconstrained and adopt the following $T$-$M$ relation, with no
previous assumption about the parameters:

$$T_{gas} = T_0\,M_{15}^{\alpha}\,(1+z)^{\psi}, \qquad (9)$$

where $M_{15}$ is the cluster mass in units of $10^{15}\,h^{-1}\,M_\odot$. For
the $L_x$-$M$ relation the situation is similar. The $L_x$-$M$ relation is not
well established and we prefer to leave it as a free-parameter relation:

$$L_x^{Bol} = L_0\,M_{15}^{\beta}\,(1+z)^{\phi}. \qquad (10)$$

Since $T_{gas}$ is in Kelvin, $L_x^{Bol}$ is in $h^{-2}$ erg s$^{-1}$ and the
mass is in units of $10^{15}\,h^{-1}\,M_\odot$, additional factors $h^{\alpha}$
and $h^{\beta}$ must be introduced in $T_0$ and $L_0$ respectively in order to
make our results $h$-independent.
From the previous $L_x$-$M$ relation it is possible to build the $S_x$-$M$
relation by simply considering:

$$S_x^{Bol} = \frac{L_x^{Bol}}{4\pi D_L(z)^2}, \qquad (11)$$

where $D_L(z)$ is the luminosity distance. In this formalism the $L_x$-$T$
relation has the form:

$$L_x^{Bol} = \frac{L_0}{T_0^{\gamma}}\,T^{\gamma}\,(1+z)^{\delta}, \qquad (12)$$

where $\gamma = \beta/\alpha$ is the familiar exponent of the $L_x$-$T$
relation and $\delta = \phi - \psi\beta/\alpha$. Within this framework we have
a total of 9 free parameters: $\sigma_8$, $\Gamma$, $\Omega_m$, $T_0$,
$\alpha$, $\psi$, $L_0$, $\beta$ and $\phi$ (or equivalently we can use
$\gamma = \beta/\alpha$ instead of $\beta$, and $\delta = \phi - \gamma\psi$
instead of $\phi$). We have also considered two situations: flat
$\Lambda = 1 - \Omega_m$ models ($\Lambda$CDM) and open $\Lambda = 0$ models
(OCDM).
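The relation between the three scalings can be checked numerically. The sketch
below implements equations (9), (10) and (12) and verifies that the luminosity
obtained directly from the mass agrees with the one obtained through the
temperature when $\gamma = \beta/\alpha$ and $\delta = \phi - \gamma\psi$. The
function names are illustrative and the parameter values are only
representative numbers.

```python
def temperature(m15, z, t0, alpha, psi):
    """Equation (9): T_gas = T0 * M15**alpha * (1+z)**psi."""
    return t0 * m15 ** alpha * (1.0 + z) ** psi

def luminosity_from_mass(m15, z, l0, beta, phi):
    """Equation (10): L_bol = L0 * M15**beta * (1+z)**phi."""
    return l0 * m15 ** beta * (1.0 + z) ** phi

def luminosity_from_temperature(temp, z, t0, alpha, psi, l0, beta, phi):
    """Equation (12) with gamma = beta/alpha and delta = phi - gamma*psi."""
    gamma = beta / alpha
    delta = phi - gamma * psi
    return l0 * (temp / t0) ** gamma * (1.0 + z) ** delta

# Representative values: virial T-M relation and gamma = 2.9
tm_params = dict(t0=1.0e8, alpha=2.0 / 3.0, psi=1.0)
lm_params = dict(l0=1.0e45, beta=2.9 * 2.0 / 3.0, phi=3.0)

temp = temperature(0.5, 0.3, **tm_params)
lum_direct = luminosity_from_mass(0.5, 0.3, **lm_params)
lum_via_t = luminosity_from_temperature(temp, 0.3, **tm_params, **lm_params)
```

Here `lum_direct` and `lum_via_t` agree to rounding error, which is the
statement that only two of the three relations are independent.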

There are some experimental determinations of the parameters in $T$-$M$ and
$L_x$-$M$. For instance, many works have shown that $\alpha$ is compatible with
the predicted virial value $\alpha = 2/3$ (Neumann & Arnaud 1999), but scaling
exponents $\alpha \sim 0.5$ are also possible (Horner et al. 1999; Nevalainen
et al. 2000). The normalization of the $T$-$M$ scaling has been determined by
many authors, who found typical values of $T_0 \sim 1.0 \times 10^{8}$ K
(Horner et al. 1999). There is not much work on the determination of the
redshift exponent $\psi$, because the data and redshift coverage are too poor
to fit this exponent, but it is usually found to be compatible with the virial
prediction $\psi \sim 1$ (Neumann & Arnaud 1999). For the $L_x$-$M$ relation
the scatter in the data is too large (large error bars in mass), but the
situation improves when the $L_x$-$T$ relation is considered instead. In the
latter case the scatter in the correlation is reduced. Typical values for the
parameters in these relations are $L_0 \sim 1.0 \times 10^{45}\,h^{-2}$ erg/s,
$\gamma \sim 2.9$ (Arnaud & Evrard 1999) and $\delta \sim 0$ (Borgani et al.
1999; Reichart et al. 1999; Fairley et al. 2000), although the uncertainty in
this last parameter is large. From the relation between $\delta$ and $\phi$ it
is easy to infer that $\phi \approx 3$ is expected when $\psi = 1$ and
$\gamma \sim 3$. In fig. 1 the model was chosen according to these typical
values. From $L_x$-$T$ and $T$-$M$ it is easy to infer the parameters in
$L_x$-$M$ and vice versa.

In our fit, we have allowed the parameters to take different values around
these observational and theoretical predictions.



We are now ready to build the five theoretical curves dN(M)/dM, dN(M,z)/dM, dN(L_x)/dL_x, dN(S_x)/dS_x and dN(T)/dT, and to look for the best model by comparing these curves with the data.

Similar analyses have been presented in previous works. However, we would like to stress again that in those works either some parameters are fixed (in T-M or L_x-M) or only one data set is used (e.g. dN(M)/dM, dN(T)/dT, etc).
In Mathiesen & Evrard (1998), the authors combined a free 
parameter L x — M relation and two data sets (dN(L)/dL, 
and dN(S)/dS) in order to say something about the evolu- 
tion of the L x — T relation. However, they fixed the T — M 
relation and they did not combine together the results com- 
ing from the two different data sets. A similar work was 
done in Borgani et al. (1999) where the authors have used 
the observables, flux number counts, redshift distribution 
and X-ray luminosity function over a large redshift baseline 
(z < 0.8) of the RDCS in order to constrain cosmological 
models. In the same paper, no assumption is made a priori 
on the L x — M relation, except for the amplitude of this re- 
lation which is fixed by the authors. In addition the T — M 
relation is fixed to the usual spherical collapse plus virial 
plus isothermal gas distribution model. 

In Bridle et al. (1999) they have combined the X-ray cluster 
temperature function (Henry & Arnaud 1991, Henry 2000) 
with CMB data and the IRAS 1.2 Jy galaxy redshift survey, 
but they have assumed a fixed T — M relation. This latter 
point can affect the final result. 

Up to now, no previous work has combined such a large 
number of data sets as the five ones we have used without 
including any assumptions about the normalization or spe- 
cific scalings of the temperature or X-ray luminosity. 

As we mentioned at the end of the previous section, the 
model we have assumed will introduce some correlations be- 
tween the 5 theoretical curves. Just by looking to equation 
(0), it is clear that the temperature function is correlated 
with the mass function (equivalently for the luminosity and 
flux functions). This point should be taken into account 
when fitting the data. 

4 STATISTICAL ANALYSIS AND RESULTS

In order to fit the five data sets we must decide which estimator to use. Because we assume scaling relations between mass and temperature (T-M) and between mass and X-ray luminosity (L_x-M), there must be some correlations among the five simulated data sets. Therefore we should start by considering an estimator like the standard likelihood, which takes all the correlations into account through the correlation matrix M. In our case, the model depends on 9 free parameters, and if we consider a grid of, say, 5 values per parameter, we would have to compute the correlation matrix for 5^9 ≈ 2 million different models. This process would take many years. A faster technique would require a search method that avoids exploring the whole parameter space. This would be the technique of choice if we were interested only in the best model, but we also want to know the error bars or, in other words, the probability distribution of the parameters. To obtain it we need to know the probability on a regular grid.

To simplify the problem, the simplest approach is to consider the standard χ²_joint as our estimator:

\chi^2_{joint} = \chi^2_{M} + \chi^2_{M(z)} + \chi^2_{T} + \chi^2_{L_x} + \chi^2_{S_x} ,    (13)

where each χ²_i represents the ordinary χ² for the corresponding one of the five data sets, and we are assuming that the correlation matrix is in this case diagonal.
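
A minimal sketch of eq. (13) in Python, assuming each data set is given as lists of measured values and 1-sigma error bars and that the model has already been evaluated at the same points. The dictionary layout and the toy numbers are our own choices for illustration.

```python
def chi2(data, sigma, model):
    """Ordinary chi-square of one data set (diagonal errors only)."""
    return sum(((d - m) / s) ** 2 for d, s, m in zip(data, sigma, model))


def chi2_joint(datasets, predictions):
    """Eq. (13): sum of the individual chi-squares, ignoring cross-correlations.

    datasets    : dict, name -> (data values, error bars)
    predictions : dict, name -> model values at the same points
    """
    return sum(chi2(data, sigma, predictions[name])
               for name, (data, sigma) in datasets.items())


# Toy numbers, not taken from any of the catalogues used in the paper.
datasets = {
    "dN/dM": ([1.0, 2.0], [1.0, 1.0]),
    "dN/dT": ([3.0], [0.5]),
}
predictions = {"dN/dM": [0.0, 0.0], "dN/dT": [2.0]}
total = chi2_joint(datasets, predictions)
```

Here the first set contributes 1 + 4 and the second (1/0.5)^2 = 4, so the joint value is 9.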

By doing this, we know that we are neglecting the correlations between the curves and that there will be some bias in our estimation. For this reason, we also want to check a more elaborate estimator.

We have considered as a second estimator of the best model one based on Bayesian theory (Lahav et al. 1999):

-2 \ln P_L = \chi^2_L ,    (14)

where

\chi^2_L = \sum_{i=1}^{5} N_i \ln(\chi^2_i) .    (15)



In this estimator, χ²_i is again the ordinary χ² for each data set and N_i represents the number of data points in data set i. Based on a Bayesian approach with the choice of non-informative uniform priors on the log, those authors have shown that this estimator is appropriate when different data sets are combined, as in our case. The factor N_i plays the role of a weight: larger data sets are considered more reliable for the parameter determination.
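
Eqs. (14) and (15) can be sketched as follows, taking the per-data-set χ²_i values and the numbers of points N_i as inputs. The function names and the example numbers are hypothetical and only meant to show how the weighting by N_i acts.

```python
import math


def chi2_lahav(chi2_values, n_points):
    """Eq. (15): chi_L^2 = sum_i N_i * ln(chi_i^2)."""
    return sum(n * math.log(c) for c, n in zip(chi2_values, n_points))


def log_prob(chi2_values, n_points):
    """Eq. (14): ln P_L = -chi_L^2 / 2 (up to an additive constant)."""
    return -0.5 * chi2_lahav(chi2_values, n_points)


# Doubling chi_i^2 of a 3-point set changes chi_L^2 by 3*ln(2);
# doubling it for a 20-point set changes chi_L^2 by 20*ln(2).
small_set_shift = chi2_lahav([2.0], [3]) - chi2_lahav([1.0], [3])
large_set_shift = chi2_lahav([2.0], [20]) - chi2_lahav([1.0], [20])
```

The same fractional change in χ²_i therefore moves the estimator more for a large data set than for a small one, which is the sense in which N_i acts as a weight.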

We have checked both estimators by performing a bias test. In this test we have simulated the five data sets for a specific model, with error bars computed in the same way as for the real data. The input model was chosen to be as close as possible to the data (for instance, the model which minimizes χ²_joint). In the simulations, we have taken into account all the characteristics of the data, that is, sky coverage, limiting flux, maximum redshift, etc. We then compare each of these realizations of the assumed model with the models previously computed on the grid, and for each realization we obtain the best-fitting model to the simulated data using both estimators.
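
The logic of such a bias test can be sketched with a toy problem. The one-parameter model below, its two mock "data sets" and the Gaussian noise are all our own assumptions for illustration; they do not reproduce the simulations of the five cluster data sets, which also include sky coverage, flux limits and so on.

```python
import math
import random
from collections import Counter


def chi2(data, sigma, model):
    """Ordinary chi-square of one data set against one model curve."""
    return sum(((d - m) / s) ** 2 for d, s, m in zip(data, sigma, model))


def bias_test(grid, predict, truth, sigmas, n_real=200, seed=0):
    """Count how often each grid model is selected by each estimator.

    grid    : list of candidate parameter values
    predict : hypothetical function, parameter -> list of model curves
    truth   : input parameter used to simulate the data
    sigmas  : list of error-bar lists, one per data set
    """
    rng = random.Random(seed)
    true_curves = predict(truth)
    hist_joint, hist_lahav = Counter(), Counter()
    for _ in range(n_real):
        # One Gaussian realization of every data set around the input model.
        sims = [[rng.gauss(m, s) for m, s in zip(curve, sig)]
                for curve, sig in zip(true_curves, sigmas)]

        def per_set(p):
            return [chi2(d, s, m) for d, s, m in zip(sims, sigmas, predict(p))]

        def joint(p):
            return sum(per_set(p))

        def lahav(p):
            return sum(len(d) * math.log(c) for d, c in zip(sims, per_set(p)))

        hist_joint[min(grid, key=joint)] += 1
        hist_lahav[min(grid, key=lahav)] += 1
    return hist_joint, hist_lahav


def toy_predict(a):
    """Toy model: two curves of different length depending on one parameter."""
    return [[a * x for x in (1.0, 2.0, 3.0, 4.0)],
            [a * a * x for x in (1.0, 2.0)]]


grid = [0.6, 0.7, 0.8, 0.9, 1.0]
sigmas = [[0.3] * 4, [0.3] * 2]
hist_joint, hist_lahav = bias_test(grid, toy_predict, truth=0.8, sigmas=sigmas)
```

The two counters play the role of the histograms in the bias test: the input value (0.8 here) can be compared with the peak of each distribution to see whether an estimator is biased.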

Figure 2. The histogram represents the number of times each parameter value was selected as part of the best model by the standard χ²_joint (dotted) and by χ²_L (solid). The black dot represents the input model.



Table 1. Best ΛCDM and OCDM models (ψ fixed to 1.0). Error bars represent the projection of the contour at the 68% confidence level of the 8-dimensional probability onto each of the parameters. Limits marked with (*) must be considered as lower limits because the parameter was not explored above that limit.

In fig. 2 we plot the number of times each parameter value was selected as part of the best model by the first and second estimators. The dot represents the input model. As can be seen from the histograms, the second estimator, χ²_L, works somewhat better than the standard χ²_joint. There is still some bias, but the agreement between the input model and the recovered peak of the distribution is very good.
We can get some interesting information from these plots. The dispersion of the histograms indicates how sensitive the estimator is to each parameter. For instance, the cosmological parameters are well constrained. This is not the case for the redshift exponent ψ. We fixed this parameter to the virial value ψ = 1 because our method is not sensitive to it: when this exponent was changed, the simulated curves did not change appreciably. There is an explanation for this. The exponent appears only in the T-M relation, as the redshift exponent. This relation is needed to construct the temperature function, and those data only extend up to redshift ~ 0.1. It is therefore not surprising that we cannot obtain any significant result about the redshift dependence from these data. The T-M relation also enters the calculation of the X-ray luminosity in a given band, so the exponent ψ would in principle be important when we simulate clusters at high z to compare the flux function with the RDCS data, since these data extend to z ~ 0.8. The flux in the band used by Rosati et al. (1998) is calculated from the luminosity in that band (see eq. 11), and L_band is computed from L_Bol as L_band = L_Bol f_band, where f_band includes the band and K corrections and is usually well approximated by the integral of the frequency dependence of the Bremsstrahlung emission:

f_{band} = \frac{1}{k_B T} \int_{E_{min}}^{E_{max}} e^{-E / (k_B T)} \, dE ,

where E_min and E_max are the energy limits of the band and T is the cluster temperature. The redshift dependence of f_band is concentrated in the K-correction, and there is also a weak dependence on the redshift exponent of the T-M relation. This dependence is too weak to impose any constraint on the exponent, even when we use data at medium-high redshift like the fluxes of clusters at z ~ 0.8 (Rosati et al. 1998). This explains why we cannot say much about the exponent ψ with these data. We decided to fix this parameter to the standard value ψ = 1, thereby reducing the dimension of our parameter space from 9 to 8. However, this parameter should be treated as free when dealing with future data for which the redshift coverage will increase significantly.
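
The band factor has a closed form, since the integral of the exponential is elementary: f_band = exp(-E_min/k_B T) - exp(-E_max/k_B T). The sketch below evaluates it in Python. Two points are our assumptions rather than statements from the text: the K-correction is modelled by shifting the observed band limits to the rest frame with a factor (1 + z), and the default limits of 0.5 and 2.0 keV are used only as an illustrative soft X-ray band.

```python
import math

K_B_KEV_PER_K = 8.617333e-8  # Boltzmann constant in keV per kelvin


def f_band(t_kelvin, z, e_min_kev=0.5, e_max_kev=2.0):
    """Fraction of the bolometric Bremsstrahlung emission falling in the band.

    Assumption: the observed band [e_min, e_max] corresponds to the
    rest-frame band [e_min*(1+z), e_max*(1+z)] (simple K-correction).
    """
    kt = K_B_KEV_PER_K * t_kelvin
    lower = e_min_kev * (1.0 + z) / kt
    upper = e_max_kev * (1.0 + z) / kt
    # Closed form of (1/kT) * integral of exp(-E/kT) dE between the limits.
    return math.exp(-lower) - math.exp(-upper)


def l_band(l_bolometric, t_kelvin, z):
    """L_band = L_Bol * f_band, as in the text."""
    return l_bolometric * f_band(t_kelvin, z)
```

Evaluating `f_band` at fixed mass and redshift for temperatures computed from eq. (9) with different values of ψ gives a direct way to inspect how the band factor responds to that exponent.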
Another result from the bias test is that there is some bias in the parameter α (the scaling exponent of the T-M relation). The bias is about 0.05 or more towards higher values of α. We will return to this point later. A similar bias is found in L_0 (about 0.03 towards higher values). The bias is not large considering the small bin interval, but it must be taken into account.

Apart from these parameters, the second estimator seems to be a good indicator of the best model.



The next step is to compute the probability distribution in our 9-dimensional parameter space (8-dimensional after fixing ψ), using the second estimator. We have used a grid with about 2 million different models in each of the two cases, flat ΛCDM and OCDM, and for each model we have computed its P_L (eq. 14). In fig. 3 we plot the best model compared with four of the data curves used in the fit. It is instructive at this point to compare figure 1 and figure 3. The two cases differ only slightly in the cluster scaling relations, but the differences between the models are significant, especially for the luminosity and flux functions. This shows the sensitivity of the models to the cluster scaling relations. Small changes in the parameters of these scalings can produce a completely different function if all the changes shift the function in the same direction. The best models listed in table 1 are an example of fine tuning between the parameters: a change in one parameter must be compensated by a change in one or more of the others in order to keep the model compatible with the data, and only a small region of the parameter space is allowed. This also explains why the temperature function does not change significantly. While the luminosity and flux functions need both scaling relations (T-M and L_x-M), the temperature function requires only the T-M relation, which reduces the number of parameters involved and consequently the change in the temperature function when the whole set of parameters is varied.

In fig. 4 the best model is compared with the fifth curve. There is good agreement between our best-fitting model and all the data sets except the fifth one, where the model predicts lower comoving number densities at high z than observed (only 2 clusters in the z ~ 0.54 bin and 1 in the z ~ 0.8 bin). However, one should bear in mind that the fifth curve has only three data points, which also have large error bars, and therefore the weight of the fifth curve in the estimator of Lahav et al. (see eq. 15) is low compared with the weight of the other data sets. When the band corresponding to the 68% confidence region of the cosmological parameters is considered, it overlaps the data within the 68% error bars.

On the other hand, the dN(M,z)/dM curve is useful in the sense that including it in the analysis helps to break the degeneracy between σ_8 and Ω_m (as we will show in the next section).

This point suggests the need for better quality data on the evolution of the mass function in order to make these data a decisive discriminator between the models.



5 DISCUSSION 

We have computed the marginalized probability of the parameters in order to see how well constrained they are. Fig. 5 shows the power of the method to constrain the cosmology; even the amplitudes of the T-M and L_x-M relations are well constrained. As seen in the bias test, we cannot say much about the exponents α and γ, except that high values are favored. Virial theory predicts α = 2/3, which is compatible (at 68%) with our fit values given in table 1. However, models with α = 0.8 work better than virial models, and higher values might work even better. (We did not check this possibility because we wanted to keep the parameters close to their expected values.)
In Nevalainen et al. (2000), the authors found α ~ 0.55, which is inconsistent with the self-similar (virial) prediction. They argue that a possible explanation for this discrepancy is preheating of the intracluster gas by supernova-driven galactic winds before the clusters collapse, as proposed by e.g. David et al. (1991), Evrard & Henry (1991), Kaiser (1991) and Loewenstein & Mushotzky (1996). If supernovae release a similar amount of energy per unit gas mass in hot and cool clusters, the coolest clusters would be affected more significantly than the hottest ones. This increase in their temperature will change the slope of the T-M relation towards low α values. The data sets we have considered contain bright clusters with temperatures that are typically T > 3 keV. At those high temperatures the previous effect should not be relevant, and hence the slope of the T-M relation should approach the self-similar value (2/3) (see fig. 2 in Nevalainen et al. 2000). This can explain why our results are more compatible with the virial prediction than with those empirical relations where cool clusters are included in the fit.

Figure 3. Expected curves compared with data for the best ΛCDM model (solid) and OCDM model (dotted). See table 1.

Figure 4. The best ΛCDM (dashed) and OCDM (dotted) models compared with the fifth data set (evolution of the mass function). Although the model lies outside the 68% error bars, it is inside the 95% error bars, which are not shown (see Bahcall & Fan 1998). The shaded regions correspond to the 68% confidence region of the cosmological parameters (densely shaded: OCDM; lightly shaded: ΛCDM).

Figure 5. Marginalized probability distributions for the 8 parameters. Dotted lines are for open CDM models and solid lines for flat ΛCDM models. In both cases the ψ parameter was fixed to 1 (see text).

A possible source of systematic errors in our best-fitting values (including α) lies in the data themselves. The data sets used in this work suffer from several systematics which can affect the best-fitting parameters of the T-M relation. In our method, the best-fitting T-M relation is obtained from a global fit of the model to all the data. If the data sets change in some way, the best-fitting model should change as well. In the mass function, masses are defined inside a fixed radius, and a different choice of this radius could produce a different estimate of the cluster mass function. In the X-ray flux and luminosity functions, the inferred fluxes and luminosities depend on the cluster profile assumed to extrapolate the observed surface brightness profile (Vikhlinin et al. 1998). If masses, fluxes or luminosities are underestimated or overestimated, we should expect some differences in the best-fitting parameters, and in particular in α.

These systematics will be reduced with future determinations of these quantities (M, T, L_x). Cluster mass estimates can be clearly improved using the lensing technique. On the other hand, on-going X-ray missions (CHANDRA, XMM-Newton) will be able to determine the cluster surface brightness profile out to larger radii and with higher quality for a significant number of clusters. Furthermore, from the bias test we know that there is some bias in the peak of the distribution of the α parameter, so our value α = 0.8, high compared with the virial one, may be due to the bias in our estimator. In any case, our estimate of α is compatible (given the error bars) with the virial exponent. It could also be that hot clusters really behave in this way, showing a tendency towards high α exponents. In order to distinguish between the two possibilities, more and better quality data are needed.

The second exponent, γ, also points to high values. In this case we know from the bias test that this exponent is degenerate. This, together with the error bars found, can very well accommodate an exponent γ ~ 2.9, which is the value most frequently obtained in the literature when fitting the L_x-T relation directly. However, the direct estimate of L_x-T suffers from large scatter, and the results differ considerably depending on the kind and number of clusters considered, so high values of γ should not be ruled out yet. For instance, Borgani et al. (1999) found 3 < γ < 4 when fitting a phenomenological L_x-T relation plus PS to the local X-ray luminosity function.

Concerning the redshift exponent φ, we have somewhat more information than the null information we obtained for ψ. This is not surprising because the L_x-M relation appears in the calculation of dN(L)/dL and dN(S)/dS, where the data lie in z ∈ [0, 0.3] and z ∈ [0, 0.8] respectively, and these redshift intervals are much deeper than that of the dN(T)/dT data. Although the best value differs for the two cosmologies considered, the value φ ~ 3 is allowed in both cases. Experimentally, there is no determination of the φ parameter. What different authors assume when they try to fit the L_x-M relation to real data is that there is no redshift dependence in this relation, that is, they simply fit the relation L_x = L_0 M^β. However, we have shown in section 3 that the unobserved φ parameter can be related to the redshift exponent in the L_x-T relation (eq. 12), δ = φ - ψγ, and using this relation we can infer the value of φ. Typical values for δ found in the literature are δ ~ 0 (Fairley et al. 2000). Borgani et al. (1999) have shown that the L_x-T relation is compatible with no evolution. This result is also consistent with that of Mushotzky & Scharf (1997), who compared results from a sample of ASCA temperatures at z > 0.14 with the low redshift sample of David et al. (1993) and found that data out to z ~ 0.4 are consistent with no evolution in L_x-T.

Now, if we assume ψ = 1 (from virial models) and γ ~ 3 (from the empirical L_x-T relation), then φ should be φ ~ 3 in order to satisfy δ ~ 0. So we can conclude that φ ~ 3 is compatible with the virial assumption and also with γ ~ 3.

For a comparison of our results with a recent determination of the L_x-T relation see, for instance, Fairley et al. (2000). It is remarkable that those authors find γ = 3.15, very close to our preferred value. They also found an amplitude of the L_x-T relation of C = (6.04 ± 1.47) × 10^42 erg/s. This value should be compared with the amplitude L_0 in our L_x-M relation, L_0 ~ 1.0 × 10^45 h^β h^-2 erg/s, which corresponds to an amplitude in L_x-T (see eq. 12) of L_0/T_0^γ = 6.25 × 10^42 erg/s (for γ = 3, T_0 = 1.0 × 10^8 h^α K and taking h = 0.5, which is the value used in Fairley et al. 2000). The normalization obtained here for the T-M relation is higher than those obtained from simulations or pure cluster modelling (spherical symmetry, virialization, hydrostatic equilibrium). This is not surprising, as this kind of modelling does not include some physical processes relevant to cluster formation and evolution. Our results should be compared with observational determinations of this relation like the ones in Horner et al. (1999), who found values for the T-M normalization compatible with our estimate (see table 1 in Horner et al. 1999).
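
The amplitude quoted above can be checked with a few lines of arithmetic. The sketch below assumes, as our own reading rather than something stated in the text, that the L_x-T amplitude is expressed per keV^γ, so the temperature is converted from kelvin to keV. Because γ = β/α, the factors h^β in L_0 and h^(αγ) in T_0^γ cancel and only h^-2 remains.

```python
K_PER_KEV = 1.160452e7  # kelvin corresponding to k_B * T = 1 keV


def lx_t_amplitude(l0_coeff, t0_coeff, gamma, h, kelvin_per_unit=K_PER_KEV):
    """Amplitude L_0 / T_0^gamma of eq. (12), with T measured in keV.

    l0_coeff : numerical coefficient of L_0 (erg/s), without its h factors
    t0_coeff : numerical coefficient of T_0 (K), without its h factors
    The h^beta in L_0 and h^(alpha*gamma) in T_0^gamma cancel, leaving h^-2.
    """
    amplitude_per_kelvin = l0_coeff * h ** -2 / t0_coeff ** gamma
    # Convert from erg/s per K^gamma to erg/s per keV^gamma (assumed units).
    return amplitude_per_kelvin * kelvin_per_unit ** gamma


amplitude = lx_t_amplitude(l0_coeff=1.0e45, t0_coeff=1.0e8, gamma=3.0, h=0.5)
```

Under this assumption the result is close to the 6.25 × 10^42 erg/s quoted in the text, which supports reading the amplitude as referring to temperatures in keV.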

It is important to point out that not all the parameter combinations inside the error bars in table 1 correspond to models which are simultaneously compatible with all the data sets. As we have shown in fig. 1, the model with parameters σ_8 = 0.8, Γ = 0.2, Ω_m = 0.3 (Λ = 0), T_0 = 1.0 × 10^8 K, α = 2/3, ψ = 1.0, L_0 = 1.0 × 10^45 h^β h^-2 erg/s, γ = 2.9, φ = 3.0 is an example of a 'bad' model in the sense that it does not fit all the data sets. One should also notice that, although these values are inside the error bars given in the table, those error bars are projected ones, so not all the possible combinations are allowed at the 68% confidence level. Therefore, when choosing a model it is important to bear in mind the correlations among the parameters.
The method is really powerful in the determination of the cosmological parameters. We made a consistent fit to five different data sets and obtained strong constraints on the cosmological parameters. Independently of Λ, only low-density universes are compatible with the different data sets. The amplitude of the power spectrum is also well constrained. Its value is consistent with, for instance, the one obtained by Bridle et al. (1999), who combined cluster, CMB and IRAS data using the same estimator of Lahav et al. and obtained σ_8 ~ 0.75 and Ω_m ~ 0.35.
We have computed the marginalized probability in the (σ_8, Ω_m) plane in order to look for the well-known σ_8-Ω_m correlation (Eke et al. 1996; Carlberg et al. 1997; Henry 1997; Kitayama & Suto 1997; Bahcall & Fan 1998; Borgani et al. 1999; Bridle et al. 1999). Of the five data sets, the function dN(M,z)/dM shows a tendency to favor low-density models (Ω_m < 0.2), whereas the others seem to favor slightly higher values of Ω_m. Although our grid is coarse (intervals of 0.1 in σ_8 and Ω_m), we have seen that when the five data sets are combined there is a clear peak at the cell (σ_8 = 0.8, Ω_m = 0.3) in both ΛCDM and OCDM models. Approximately 50% of the marginalized probability volume is enclosed in that 0.1 × 0.1 cell (see fig. 6).
This shows that the degeneracy between these two parameters can be broken by combining different data sets.
Of the 5 data sets considered in this work, the evolution of the cluster population with redshift (Bahcall & Fan 1998) is, in principle, the most sensitive to changes in the cosmological parameters. However, that data set suffers from large error bars due to the small number of clusters in the high redshift bins. We made an additional test to check the weight of this data set in our fit: we recomputed the marginalized probability in Ω_m-σ_8, excluding the Bahcall & Fan (1998) data set from the fit. The result is very similar to the one shown in fig. 6. This demonstrates that with only the low redshift data sets it is possible to break the degeneracies present when each of the individual data sets is analyzed separately.

Figure 6. Marginalized probability in σ_8-Ω_m for the flat ΛCDM case (for OCDM the situation is similar). The probability distribution has been interpolated in order to smooth the surface. The contour shows the region at the 65% confidence level and the dotted curve corresponds to the correlation law σ_8 = 0.5 Ω_m^(...).
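
The marginalization over a regular grid used for the (σ_8, Ω_m) plane can be sketched as follows. We assume that the χ²_L values are stored in a dictionary keyed by parameter tuples and that the probability of each grid model is proportional to exp(-χ²_L/2), following eq. (14); the data layout, function name and toy grid are hypothetical.

```python
import math
from collections import defaultdict


def marginalize(chi2_l_grid, keep):
    """Marginalized, normalized probability over the parameters listed in keep.

    chi2_l_grid : dict, parameter tuple -> chi_L^2 of that grid model
    keep        : indices of the parameters to retain (e.g. sigma_8, Omega_m)
    """
    best = min(chi2_l_grid.values())  # subtract the minimum to avoid underflow
    marginal = defaultdict(float)
    for params, value in chi2_l_grid.items():
        key = tuple(params[i] for i in keep)
        marginal[key] += math.exp(-0.5 * (value - best))
    total = sum(marginal.values())
    return {key: weight / total for key, weight in marginal.items()}


# Toy grid: (sigma_8, Omega_m, one nuisance parameter) -> chi_L^2.
toy_grid = {
    (0.8, 0.3, 1): 0.0,
    (0.8, 0.3, 2): 0.0,
    (0.7, 0.4, 1): 2.0 * math.log(2.0),
    (0.7, 0.4, 2): 2.0 * math.log(2.0),
}
plane = marginalize(toy_grid, keep=(0, 1))
```

In this toy grid the cell (0.8, 0.3) receives two thirds of the marginalized probability, which is the kind of fraction quoted in the text for the peak cell.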

The fit to the flat ΛCDM model was slightly better than the fit to the open model, in the sense that the best-fitting ΛCDM model had a smaller χ²_L (76.1 compared with 76.8). In order to compare both cases in a more realistic way we performed the following statistical test. For each of 500 simulations of the OCDM model we obtained the best model given by the χ²_L estimator applied to both situations (ΛCDM and OCDM models). The result was that 197 of the 500 OCDM simulations had a smaller χ²_L in the flat case, while in the remaining 303 simulations the open case was preferred. This indicates that the method cannot clearly discriminate between the two cases.

The constraints given here will improve when new, high quality data become available (CHANDRA and XMM-Newton). The method proposed should be very useful for constraining the cosmology with the upcoming data.



6 CONCLUSIONS 

In this work, we have shown that our method, which combines different data sets for the cluster population, is a powerful tool to constrain both the cosmology and the cluster scaling relations.

Our method is robust in the sense that neither assumptions about the cosmology nor specific cluster scaling relations are made a priori.

Despite the correlations in the theoretical curves, we have shown that with simple estimators (like the standard χ²_joint and the Bayesian estimator of Lahav et al.) it is possible to fit the data without any significant bias.

The main conclusions of this paper are the following.
Regarding the cosmology, we have shown that only low-density (flat and open) models are compatible with the data sets considered in this paper. The marginalized probability in the (σ_8, Ω_m) plane shows a clear peak at (σ_8 = 0.8, Ω_m = 0.3) in both ΛCDM and OCDM models. This is a very interesting conclusion because previous works (Eke et al. 1996; Kitayama & Suto 1997; Bahcall & Fan 1998; Borgani et al. 1999; Bridle et al. 1999) show a degeneracy between these two parameters. This degeneracy is broken when the five data sets used in this paper are considered together. It is worth remarking that Bridle et al. (1999) combine cluster abundance, CMB and IRAS data and find values for (σ_8, Ω_m) very close to our best-fitting model. It is also important to note that this result is compatible with the recent determinations of the Ω_m parameter obtained by the BOOMERANG team (De Bernardis et al. 2000; Lange et al. 2000) and MAXIMA (Hanany et al. 2000; Balbi et al. 2000).

The third cosmological parameter, Γ, is consistent with the value obtained from the fit of the power spectrum of galaxies assuming CDM (Peacock & Dodds 1994; Viana & Liddle 1996).

The parameters obtained for the cluster scaling relations are consistent with empirical determinations of such scalings. However, we find a tendency towards high values of the α exponent, which could contradict recent determinations of this exponent (Nevalainen et al. 2000). As mentioned in the discussion, we know that there is a bias in our estimation of α. Therefore our estimate is compatible (within the error bars and the bias) with the virial exponent α = 2/3.

Additional data coming from high redshift clusters (CHANDRA, XMM-Newton, PLANCK) will improve this result. Particularly interesting is the work that can be done with future CMB surveys. The PLANCK satellite will explore the whole sky at different frequencies (from 30 GHz to 800 GHz) and with resolutions between 5 arcmin and 30 arcmin. At these frequencies and with those resolutions we have shown (Diego et al. 2000) that many clusters are expected to be observed at high redshift (z > 2) through the Sunyaev-Zel'dovich effect (see fig. 7). PLANCK is expected to detect those clusters with S_mm > 30 mJy. The information these clusters will provide will be decisive to definitely exclude many models. As shown for instance in Barbosa et al. (1996), Aghanim et al. (1997) and Diego et al. (2000), the SZE can be considered as a clear probe of the cosmological parameters. In particular, from the previous discussion we concluded that we are not able to discriminate between ΛCDM and OCDM models. However, from fig. 7 it is evident that through the SZE it could be possible to distinguish between these two models at a very high confidence level.

Figure 7. Prediction of the integrated number counts of the cluster population at mm wavelengths (353 GHz) (SZE) for the best flat (solid) and open (dotted) models in table 1. In the plot three redshift shells are represented: top z < 1, middle z ∈ [1, 2] and bottom z > 2.